GRANT NUMBER: DAMD17-94-J-4509


TITLE:  Massachusetts  Cancer  Control  Evaluation  Project 


PRINCIPAL  INVESTIGATOR:  Susan  T.  Gershman,  M.P.H.,  Ph.D. 


CONTRACTING ORGANIZATION: Massachusetts Health Research Institute
Boston, MA 02108


REPORT  DATE:  October  1996 


TYPE  OF  REPORT:  Annual 


PREPARED FOR: Commander
U.S. Army Medical Research and Materiel Command
Fort Detrick, Frederick, MD 21702-5012

DISTRIBUTION STATEMENT: Approved for public release;
distribution unlimited


The  views,  opinions  and/or  findings  contained  in  this  report  are 
those of the author(s) and should not be construed as an official
Department  of  the  Army  position,  policy  or  decision  unless  so 
designated  by  other  documentation. 




REPORT  DOCUMENTATION  PAGE 


Form  Approved 
0MB  No.  0704-0188 


1. AGENCY USE ONLY (Leave blank)  2. REPORT DATE  3. REPORT TYPE AND DATES COVERED

October 1996  Annual (23 Sep 95 - 22 Sep 96)

4. TITLE AND SUBTITLE  5. FUNDING NUMBERS

Massachusetts  Cancer  Control  Evaluation  Project 

DAMD17-94-J-4509 


6.  AUTHOR(S) 


Susan  T.  Gershman,  Ph.D. 


7.  PERFORMING  ORGANIZATION  NAME(S)  AND  ADDRESS(ES) 

Massachusetts  Health  Research  Institute 
Boston,  MA  02108 


8.  PERFORMING  ORGANIZATION 
REPORT  NUMBER 


9.  SPONSORING/MONITORING  AGENCY  NAME(S)  AND  ADDRESS(ES) 
Commander 

U.S.  Army  Medical  Research  and  Materiel  Command 
Fort  Detrick,  MD  21702-5012 


10.  SPONSORING/MONITORING 
AGENCY  REPORT  NUMBER 


12a.  DISTRIBUTION  /  AVAILABILITY  STATEMENT 

Approved  for  public  release;  distribution  unlimited 


12b.  DISTRIBUTION  CODE 


13. ABSTRACT (Maximum 200 words)

The  Massachusetts  Cancer  Control  Evaluation  Project  demonstrates  how  data  from 
diverse  sources  can  be  integrated  and  analyzed  geographically  to  assess  cancer 
screening  efficacy  and  to  design  effective  interventions. 

Year 1 project activities focused on examining the distribution of breast
cancer  in  Massachusetts  and  throughout  the  US,  analyzing  census  demographic  data, 
and  compiling  data  sources  in  preparation  for  statistical  modeling. 

Year  2  activities  have  focused  upon  completion  of  the  statistical  model.  In 
addition  to  the  socioeconomic  variables  created  from  our  measurement  modeling  of  the 
census  tract  measures,  other  known  covariates  were  analyzed  in  Year  2.  Spatial  scan 
statistical techniques were utilized to examine the distribution of stage at
diagnosis  of  breast  cancer  cases,  and  identify  areas  of  the  state  with  clusters  of 
late-stage  diagnoses.  Data  sets  analyzed  were  incorporated  into  a  mapping  software 
package,  allowing  the  user  to  view  geographic  representations  of  data  distributions. 
This  technique  can  be  used  to  identify  localities  in  need  of  cancer  screening 
services,  assess  the  sociodemographic  characteristics  of  the  localities'  residents, 
and  assist  public  health  officials  in  tailoring  interventions  for  these  populations. 

14. SUBJECT TERMS
Breast Cancer, cancer control evaluation, geocoding, spatial
scan statistics, multivariate modeling

15. NUMBER OF PAGES
53

16. PRICE CODE


17. SECURITY CLASSIFICATION OF REPORT: Unclassified

18. SECURITY CLASSIFICATION OF THIS PAGE: Unclassified

19. SECURITY CLASSIFICATION OF ABSTRACT: Unclassified

20. LIMITATION OF ABSTRACT: Unlimited

NSN  7540-01-280-5500 


Standard  Form  298  (Rev.  2-89) 

Prescribed  by  ANSI  Std.  Z39-18 
298-102 


FOREWORD


Opinions,  interpretations,  conclusions  and  recommendations  are 
those  of  the  author  and  are  not  necessarily  endorsed  by  the  U.S. 
Army. 


_ Where copyrighted material is quoted, permission has been
obtained to use such material.

_ Where material from documents designated for limited
distribution is quoted, permission has been obtained to use the
material.

_ Citations of commercial organizations and trade names in
this report do not constitute an official Department of Army
endorsement or approval of the products or services of these
organizations.

_ In conducting research using animals, the investigator(s)
adhered to the "Guide for the Care and Use of Laboratory
Animals," prepared by the Committee on Care and Use of Laboratory
Animals of the Institute of Laboratory Resources, National
Research Council (NIH Publication No. 86-23, Revised 1985).

_ For the protection of human subjects, the investigator(s)
adhered to policies of applicable Federal Law 45 CFR 46.

_ In conducting research utilizing recombinant DNA technology,
the investigator(s) adhered to current guidelines promulgated by
the National Institutes of Health.

_ In the conduct of research utilizing recombinant DNA, the
investigator(s) adhered to the NIH Guidelines for Research
Involving Recombinant DNA Molecules.

_ In the conduct of research involving hazardous organisms,
the investigator(s) adhered to the CDC-NIH Guide for Biosafety in
Microbiological and Biomedical Laboratories.


PI - Signature


Date 


TABLE  OF  CONTENTS 


INTRODUCTION
  Purpose
  Background
  Previous Work

BODY
  Methods
  Measures
  Statistical Analysis
  Software Development
  Results and Discussion

CONCLUSIONS
  Future Work

REFERENCES

APPENDICES:
  A. Technical and Functional Specifications For Software Prototype
  B. Software Prototype Screens


LIST  OF  TABLES 


1. Proportion of Stage 1 Tracts by Selected Sociodemographic Variables
   from the 1990 Census, Massachusetts

2. Proportion of Stage 3 Tracts by Selected Sociodemographic Variables
   from the 1990 Census, Massachusetts

LIST  OF  FIGURES 

1. Elements of a Geographic Information System (GIS) for Breast Cancer
   Control Evaluation

2. Distribution: Proportion of Stage 1 Cases for 1165 Census Tracts

3. Tracts in the Highest Quartile for Proportion of Stage 1 Cancers
   (Green) and Tracts in the Lowest Quartile for Proportion of Stage 1
   Cancers (Red)

4. Distribution: Proportion of Stage 3 Cases for Each of 1165 Census
   Tracts

5. Display of Tracts that are High (Red) and Low (Green) in Proportion of
   Stage 3 Cases

6. Most Likely Cluster of Tracts with Excesses of Stage 3 Cases

7. Educational Characteristics by Census Tracts & Table Correlating
   Census Variables with the Proportion of Stage 3 Cases

8. Mammography Sites in the Lowell, Massachusetts Area


INTRODUCTION: 


Purpose 

This  study  describes  a  system  to  assess  the  efficacy  of  breast  cancer  screening.  Since 
direct  measures  of  screening  are  not  available,  this  project  uses  proxy  measures  based  on 
diagnostic  staging.  The  information  provided  by  this  system  can  be  used  by  public  health 
officials  not  only  to  identify  geographic  regions  where  screening  is  inadequate,  but  also  to 
identify  and  characterize  the  educational,  economic,  and  racial/ethnic  background  of  citizens 
residing  in  these  regions  and  to  tailor  interventions  to  fit  the  characteristics  of  the  local 
population.  The  system  for  conducting  such  an  assessment  would  also  provide  concomitant 
information  about  the  location  of  mammography  units,  display  data  geographically  on  maps,  and 
allow  for  querying  of  the  displayed  data  so  as  to  obtain  information  on  any  location. 

Background 

Breast  cancer  is  the  leading  cancer  among  Massachusetts  females  and  has  accounted  for 
30.9%  of  all  newly  diagnosed  cancer  cases  between  1982  and  1992.  Further,  there  has  been  an 
alarming  rate  of  increase  in  diagnosed  cases,  prompting  state  government  officials  in  May  1992 
to  declare  the  disease  an  epidemic.  Between  1982  and  1992,  the  age-adjusted  incidence  rate  has 
increased  30.3%,  from  90.0  cases  per  100,000  females  to  117.3  cases  per  100,000  females. 

Since  there  is  no  effective  primary  prevention  strategy  for  breast  cancer,  secondary 
prevention,  through  mammography  screening  and  early  detection,  remains  the  only  way  of 
controlling  breast  cancer  and  improving  survival.  Screening  has  been  shown  to  reduce  breast 
cancer  mortality  30  to  40%  among  women  aged  50  and  older  (Collette,  1992;  Shapiro,  1982; 
Habbema,  1986;  Chu,  1988).  A  large  scale  randomized  controlled  trial  in  Sweden  reported  a 
30%  reduction  in  breast  cancer  mortality  for  women  aged  40  or  older  attributable  to 
mammography  (Tabar,  1985  and  1992). 

Researchers  from  the  University  of  Massachusetts  Medical  Center  conducted  an 
assessment  of  the  effectiveness  of  a  multicomponent  intervention  in  two  communities  to  increase 
utilization  of  breast  cancer  screening  by  women  over  50  years  of  age  (Zapka,  1993).  They  found 
dramatic  improvement  in  both  intervention  and  control  groups  and  concluded  that  participation 
in  screening  was  a  rapidly  rising  secular  trend.  Our  efforts  were  directed  at  monitoring 
screening  efficacy  across  the  entire  state  of  Massachusetts. 

Our surveillance system builds upon those of Kerner (1984) and Andrews (1994).
Kerner and his colleagues examined geographic variation in disease incidence and mortality in
relation  to  census  variables  in  an  attempt  to  target  screening  programs,  while  Andrews  and  his 
colleagues  combined  mortality  and  census  data  to  target  cancer  screening  programs  on  a 
geographic  basis.  Our  system  integrated  data  from  a  cancer  registry  with  data  from  the  census, 
along  with  other  health  information  such  as  location  of  mammography  screening  sites,  into  a 
single  geographical  information  system  (GIS).  Dangermond  (1990)  defined  a  geographical 
information  system  as  “an  organized  collection  of  computer  hardware,  software,  geographic  data 
and personnel to efficiently capture, store, update, manipulate, analyze and display all forms of
geographically  referenced  information”.  Maguire  (1991)  argues  that  “it  is  the  ability  to  organize 
and  integrate  apparently  disparate  data  sets  together  by  geography  which  make  GIS  so  powerful. 
The  spatial  searching  and  overlay  operations  are  a  key  functional  feature  of  GIS.”  Some 
elements  of  the  GIS  used  in  this  study  are  diagrammed  in  Figure  1  and  described  below. 

Previous  Work 

Year  1  activities  focused  on  examining  the  distribution  of  breast  cancer  in  Massachusetts 
and  throughout  the  US.  Using  data  from  Massachusetts,  Connecticut,  California  and  the 
National Cancer Institute's Surveillance, Epidemiology and End Results program, we explored
trends  in  cancer  incidence,  staging,  mortality  and  mammography  screening,  and  began 
integration  of  these  data  sources.  Project  staff  also  analyzed  census  data,  prepared  population 
data  for  multiple  geographic  units  of  analysis  and  time  periods,  and  examined  correlations 
between  various  socioeconomic  factors.  Additionally,  we  compiled  a  master  file  of  data  sources 
in  preparation  for  developmental  modeling,  and  began  the  statistical  modeling. 


BODY: 

Methods 

Since  January  1,  1982  all  new  cases  of  cancer  diagnosed  in  Massachusetts  residents  have 
been  reported  to  the  Massachusetts  Cancer  Registry  (MCR),  a  Division  of  the  Massachusetts 
Department  of  Public  Health.  Each  report  to  the  registry  is  recorded  on  a  standardized  form  to 
obtain  comparable  information  from  case  to  case  about  the  type,  histology  and  stage  of  the 
disease.  Forms  also  include  demographic  information,  including  the  patient’s  age,  race, 
occupation,  smoking  status,  and  address  at  the  time  of  the  diagnosis.  For  this  study,  breast 
cancer  cases  diagnosed  between  1982  and  1992  were  aggregated  by  census  tract  and  integrated 
with geographical information, such as the location of 1177 census tracts, the location of 351
minor civil divisions (MCDs)^1, the location of 296 mammography machines at 218 sites, and
the boundary files for each of 27 Community Health Network Areas (CHNAs)^2. Breast cancer
data were aggregated into two five-year periods, 1982-1986 and 1987-1992. While data from
the first period were used to demonstrate the system and to identify areas of high or low screening
efficacy, substantive findings and the consistency of those findings over time can be
cross-validated with data from the second period. As diagrammed in Figure 1, data were also
extracted from the 1990 Census so that tracts could be characterized according to a variety of
social, economic, and demographic indicators, such as educational attainment, race/ethnicity, per
capita income, employment levels, and the distribution of occupational categories.

^1 MCDs are equivalent to the 351 incorporated cities and towns in Massachusetts. Although the data in this paper
have been aggregated at the level of the census tract, it is possible to disaggregate further to the block group level,
or aggregate to the MCD level.

^2 CHNAs are a Massachusetts Department of Public Health designation for aggregations of cities and towns.
CHNAs are used to develop health networks consisting of consortia of health care providers, human service
agencies, schools, churches, advocacy groups and members of the public of all ages. These networks identify and
assess health needs in their communities, and evaluate responses to these needs. The major foci of the networks
are increasing access to care, efficiency of health services, and communication and collaboration among health
care and human services providers in the area.

Measures 


Accurate  measures  of  mammography  screening  are  not  generally  available.  A  Wisconsin 
study  compared  levels  of  mammography  screening  using  data  from  the  Behavioral  Risk  Factor 
Surveillance  System  (BRFSS)  to  data  from  records  of  mammography  sites.  The  two  data 
sources  showed  similar  trends,  but  large  and  consistent  discrepancies  in  terms  of  the  actual 
number  of  mammograms  performed  (Lantz,  1995).  Estimates  of  screening  from  BRFSS  data 
consistently  overestimated  rates  of  screening  by  about  20%  as  compared  to  data  obtained  from 
the  mammography  sites. 


Since  direct  measures  of  screening  are  generally  not  available,  this  project  uses  proxy 
measures.  One  proxy  measure  suggested  by  Roffers  and  Austin  (1993)  is  based  upon  the 
proportion  of  cases  diagnosed  at  an  in  situ  or  localized  stage.  Boss  and  Suarez  (1990)  also 
suggested  using  the  ratio  of  in  situ  diagnoses  to  all  invasive  cases  as  a  measure  to  evaluate 
screening  programs.  Roffers  and  Austin  maintain  that  if  at  least  5%  and  up  to  15  or  20%  of 
newly  diagnosed  cases  are  in  situ  for  a  community,  mammography  screening  can  be  judged  as 
satisfactory.  The  MCR  began  collecting  data  on  cases  diagnosed  at  the  in  situ  stage  in  1992 
(previously, only invasive cancers were required to be reported). Stages used in analysis are:
Stage 1 (localized disease), Stage 2 (regional spread of disease), or Stage 3 (remote or metastatic
disease).

We chose to aggregate cases and data at the census tract level because use of higher levels, such
as towns or CHNAs, would mask the broad variation found within towns and within CHNAs.
We computed the proportion of female breast cancer cases in each census tract in Massachusetts
diagnosed  at  Stage  1,  Stage  2,  or  Stage  3.  Assuming  that  earlier  diagnosis  reflects  better 
screening,  tracts  with  higher  proportions  of  Stage  1  cases  (localized  disease  at  diagnosis)  were 
seen  as  having  better  screening  than  tracts  with  lower  proportions  of  Stage  1  cases.  Conversely, 
census  tracts  with  higher  proportions  of  Stage  3  cases  (remote  or  metastatic  disease  at  diagnosis) 
were  seen  as  providing  poorer  levels  of  screening  than  tracts  with  lower  proportions  of  Stage  3 
cases. 
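
The tract-level measure described above can be sketched in Python as follows. This is a minimal illustration, not the project's software: it assumes each registry record has already been reduced to a (tract identifier, stage) pair, and the function name `stage_proportions` and the sample tract identifiers are hypothetical.

```python
from collections import Counter, defaultdict

def stage_proportions(cases):
    """Return {tract_id: {1: p1, 2: p2, 3: p3}} for tracts with at least one case.

    `cases` is an iterable of (tract_id, stage) pairs, with stage in {1, 2, 3}.
    """
    counts = defaultdict(Counter)
    for tract_id, stage in cases:
        counts[tract_id][stage] += 1

    proportions = {}
    for tract_id, stage_counts in counts.items():
        total = sum(stage_counts.values())
        # A Counter returns 0 for stages with no cases in the tract.
        proportions[tract_id] = {s: stage_counts[s] / total for s in (1, 2, 3)}
    return proportions

# Hypothetical records for two tracts (not registry data).
example_cases = [
    ("T001", 1), ("T001", 1), ("T001", 2), ("T001", 3),
    ("T002", 2), ("T002", 3),
]
example_props = stage_proportions(example_cases)
```

Tracts with no cases never appear in the result, which matches the way tracts without any reported case are excluded from the analyses that follow.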

Statistical  Analysis 

A  variety  of  univariate  statistical  methods  were  used  to  describe  the  occurrence  of 
cancer  within  a  region  or  across  the  state  of  Massachusetts,  and  to  describe  social,  economic  and 
demographic  variables.  Bivariate  relationships  were  analyzed  using  chi-square  and  Pearson 
correlations;  we  also  used  polychoric  and  polyserial  correlations  to  study  associations,  but  do  not 
report  those  analyses  here.  The  relationships  between  cancer  data  and  sets  of  social,  economic 
and  demographic  variables  were  examined  in  a  variety  of  ways,  including  multiple  regression 
analysis  and  discriminant  function  analysis.  We  utilized  spatial  scan  statistics  techniques 
(Kulldorff, 1994) to test whether certain geographical regions contained clusters or excess
numbers  of  Stage  1  or  Stage  3  cases.  Spatial  scan  statistics  determine  whether  the  higher 
numbers  of  Stage  1  or  Stage  3  cases  occurring  in  some  regions  exceed  the  number  of  cases 
attributable  to  chance  variation.  Regions  with  statistically  significant  excesses  of  Stage  1  cases 
could  be  viewed  as  screening  more  effectively,  and  regions  with  statistically  significant  excesses 
of  Stage  3  cases  could  be  seen  as  deficient  in  their  screening  programs. 

Software  Development 

Both  technical  and  functional  specifications  for  the  software  prototype  have  been 
outlined  in  Appendix  A.  Project  staff  determined  that  it  would  be  important  to  incorporate  a 
mapping  capability  into  any  application  software,  and  evaluated  mapping  software  packages 
such as Maptitude and MapInfo. The software prototype will produce files which can be
imported  into  the  above  referenced  mapping  software.  The  files  will  provide  information  (such 
as  incidence  rates,  staging  distributions,  age  compositions,  and  racial/ethnic,  education,  and 
economic  variable  distributions)  which  will  be  displayed  with  the  selected  geographic  areas. 

Appendix  B  depicts  software  interfaces  which  have  already  been  designed.  Additional 
work  that  needs  to  be  accomplished  includes: 

(1)  complete  porting  of  user  interface; 

(2)  finish  query  definition  modules; 

(3)  finish  asynchronous  query  submission  modules; 

(4)  complete  implementation  of  basic  statistics,  such  as  rates  and  proportions; 

(5)  display  the  result  sets  on  the  screen  in  table  format;  and 

(6)  debug  and  test. 

Results and Discussion


Figure 2 shows the distribution of the proportion of Stage 1 cases for the 1165 tracts
reporting at least one case of breast cancer between 1982 and 1986. (Twelve of the 1177 census
tracts had no cases during that period, leaving 1165 available for analysis.) The distribution of
the proportion of Stage 1 cases ranges from zero for 29 tracts to a high of 1.0 for 17 tracts, with
most tracts falling close to the mean proportion of 0.52. While
Figure  2  informs  us  that  there  is  great  variability  among  tracts  with  respect  to  the  proportion  of 
cases  diagnosed  at  Stage  1,  it  conveys  no  information  about  geographic  variability.  Are  some 
regions  consistently  higher  or  lower  with  respect  to  the  proportion  of  Stage  1  cases  diagnosed 
within  those  tracts? 

Figure 2. Distribution: Proportion of Stage 1 Cases for 1165 Census Tracts.

In order to view the data geographically in a way that would highlight the top 25% of
tracts versus the bottom 25% of tracts, we grouped all tracts into one of three categories: 1) the
lowest quartile - this included tracts where the proportion of Stage 1 cases was less than or equal
to 0.4286; 2) the middle 50% - this included tracts where the proportion of Stage 1 cases was
greater than 0.4286 but less than 0.6250; and 3) the highest quartile - this included tracts where
the proportion of Stage 1 cases was equal to or greater than 0.6250. Figure 3 displays tracts
from the lowest and highest quartiles in red and green, respectively, and the middle 50% in
black. It does appear that there are clusters of red tracts (lowest quartile of Stage 1 diagnoses),
which might suggest that those tracts are doing a poorer job of screening, especially in
comparison to what appear to be clusters of green tracts (highest quartile of Stage 1 diagnoses).
The  black  tracts  represent  the  middle  50%  and  might  be  seen  as  about  average.  It  would  be  a 
mistake  to  draw  firm  conclusions  at  this  point  however,  because  it  is  known  that  there  may  be 
relatively  few  cases  in  some  tracts.  Therefore,  confident  characterization  of  any  given  tract,  even 
any  given  cluster  of  tracts,  requires  that  such  instability  be  taken  into  account. 
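
The three-way grouping used for the map can be expressed as a small Python function. This is an illustrative sketch only: the cut points 0.4286 and 0.6250 and the red, black, and green colors are taken from the text, while the function and constant names are hypothetical.

```python
LOWEST_QUARTILE_MAX = 0.4286   # cut point reported in the text
HIGHEST_QUARTILE_MIN = 0.6250  # cut point reported in the text

# Colors as described for Figure 3.
CATEGORY_COLORS = {
    "lowest quartile": "red",
    "middle 50%": "black",
    "highest quartile": "green",
}

def stage1_category(proportion):
    """Assign a tract to a category from its proportion of Stage 1 cases."""
    if proportion <= LOWEST_QUARTILE_MAX:
        return "lowest quartile"
    if proportion >= HIGHEST_QUARTILE_MIN:
        return "highest quartile"
    return "middle 50%"

def map_color(proportion):
    """Return the display color for a tract's Stage 1 proportion."""
    return CATEGORY_COLORS[stage1_category(proportion)]
```

A tract with two cases, one of them Stage 1, has a proportion of 0.5 and lands in the middle group, but a single additional case could move it to either extreme. That is the instability the paragraph above warns about.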


Figure  3.  Tracts  in  the  Highest  Quartile  for  Proportion  of  Stage  1  Cancers  (Green)  and  Tracts  in 
the  Lowest  Quartile  for  Proportion  of  Stage  1  Cancers  (Red). 


Note: The mapping software used to produce these figures makes use of color to distinguish data categories. In
this  figure,  the  scissor  symbols  are  green,  the  pencil  symbols  are  red,  and  the  circled  symbol  is  black. 


We  used  spatial  statistics  to  adjust  and  account  for  the  variability  and  instability 
introduced by tracts with small numbers of cases. Kulldorff's spatial scan statistic was applied
to  determine  whether  there  are  clusters  of  tracts  with  excess  numbers  of  Stage  1  cases  (above  the 
numbers  that  might  be  expected  due  to  normal  statistical  variation).  For  each  tract,  the  actual 
number  of  Stage  1  cases  for  that  tract  and  neighboring  tracts  was  compared  to  what  might  be 
expected  given  the  number  and  distribution  of  Stage  1  cases  for  the  entire  state.  The  definition 
of  neighboring  tracts  is  continually  enlarged  in  multiple  statistical  trials  to  include  up  to  10%  of 
the  total  population  of  Stage  1  cases.  The  spatial  scan  statistic  revealed  no  statistically 
significant  clusters  of  tracts  with  excess  numbers  of  Stage  1  cases.  Thus,  looking  again  at  Figure 
3,  if  there  appear  to  be  clusters  of  green  tracts,  those  clusters  are  only  apparent  and  can  be 
attributed  to  normal  variation. 
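
The general idea of the procedure described above can be sketched in Python. This is a simplified illustration and not the software used in this project. It assumes a Bernoulli likelihood-ratio model, circular zones grown around tract centroids, and Monte Carlo replication for the significance test. The function names, the coordinates, and the 5 x 5 grid of example tracts are all hypothetical.

```python
import math
import random

def _xlogy(x, y):
    # x * ln(y), using the convention that 0 * ln(0) = 0.
    return 0.0 if x == 0 else x * math.log(y)

def bernoulli_llr(c, n, C, N):
    """Log likelihood ratio for a zone with c target cases among n cases,
    given C target cases among N cases overall. Only excesses score above 0."""
    if n == 0 or n == N:
        return 0.0
    inside, outside = c / n, (C - c) / (N - n)
    if inside <= outside:
        return 0.0
    alt = (_xlogy(c, inside) + _xlogy(n - c, 1 - inside)
           + _xlogy(C - c, outside) + _xlogy(N - n - (C - c), 1 - outside))
    null = _xlogy(C, C / N) + _xlogy(N - C, 1 - C / N)
    return alt - null

def most_likely_cluster(tracts, max_fraction=0.10):
    """tracts: list of (x, y, target_cases, all_cases).
    Returns (llr, zone) for the highest-scoring zone, where zone is a list of
    tract indices and no zone holds more than max_fraction of all cases."""
    N = sum(t[3] for t in tracts)
    C = sum(t[2] for t in tracts)
    best = (0.0, [])
    for xi, yi, _, _ in tracts:
        # Grow the zone outward from this centroid, nearest tracts first.
        order = sorted(range(len(tracts)),
                       key=lambda j: math.hypot(tracts[j][0] - xi, tracts[j][1] - yi))
        c = n = 0
        zone = []
        for j in order:
            c += tracts[j][2]
            n += tracts[j][3]
            if n > max_fraction * N:
                break
            zone.append(j)
            llr = bernoulli_llr(c, n, C, N)
            if llr > best[0]:
                best = (llr, list(zone))
    return best

def monte_carlo_p_value(tracts, replicates=99, max_fraction=0.10, seed=0):
    """Estimate how often chance alone yields a cluster at least as strong."""
    rng = random.Random(seed)
    observed, _ = most_likely_cluster(tracts, max_fraction)
    # One entry per case, labelled with the index of its tract.
    case_tracts = [i for i, t in enumerate(tracts) for _ in range(t[3])]
    C = sum(t[2] for t in tracts)
    exceed = 0
    for _ in range(replicates):
        counts = [0] * len(tracts)
        # Reassign the C target cases at random, keeping tract totals fixed.
        for i in rng.sample(case_tracts, C):
            counts[i] += 1
        simulated = [(t[0], t[1], counts[i], t[3]) for i, t in enumerate(tracts)]
        if most_likely_cluster(simulated, max_fraction)[0] >= observed:
            exceed += 1
    return (exceed + 1) / (replicates + 1)

# Hypothetical 5 x 5 grid: 20 cases per tract, with two adjacent tracts
# (indices 0 and 1) given an artificial excess of target cases.
example_tracts = []
for x in range(5):
    for y in range(5):
        target = 8 if (x, y) in {(0, 0), (0, 1)} else 1
        example_tracts.append((float(x), float(y), target, 20))

example_llr, example_zone = most_likely_cluster(example_tracts)
example_p = monte_carlo_p_value(example_tracts)
```

Because each zone is scored on its pooled counts rather than tract by tract, a tract with only a handful of cases cannot produce a significant result on its own. This is how the scan statistic accounts for the small-number instability noted earlier.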

Another question suggested by Figure 3 deals with the relationship between the
educational,  racial/ethnic,  and  economic  variables  and  the  proportion  of  Stage  1  cases.  The  data 
reveal  low,  but  statistically  significant  (p<.01)  correlations.  There  are  negative  correlations 
between  the  proportion  of  Stage  1  cases  and  several  census  variables:  the  proportion  Black 
(-.17),  Hispanic  (-.13),  unemployed  (-.15),  and  the  proportion  with  less  than  a  ninth  grade 
education  (-.09).  The  correlations  are  positive  between  the  proportion  of  Stage  1  diagnoses  and 
the  proportion  of  college  graduates  (.11),  as  well  as  per  capita  income  (.11). 
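
Correlations this small can still reach statistical significance when the number of tracts is large. The Python sketch below illustrates the point under stated assumptions: it computes a Pearson correlation for hypothetical data and then the usual t statistic for a correlation, taking n = 1165 tracts as the sample size (an assumption based on the number of tracts available for analysis). The function names and sample values are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def t_statistic(r, n):
    """t statistic for testing whether a correlation r from n pairs is zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))

# Hypothetical tract-level values (not project data).
pct_unemployed = [4.0, 6.0, 8.0, 10.0, 12.0]
prop_stage1 = [0.60, 0.55, 0.58, 0.45, 0.42]
example_r = pearson_r(pct_unemployed, prop_stage1)

# The weakest correlation reported in the text, with an assumed n of 1165.
t_weakest = t_statistic(-0.09, 1165)
```

With these assumptions the t statistic for r = -.09 is about -3.08, beyond the two-sided 1% critical value of roughly 2.58 for large samples, which is consistent with the significance level reported above.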

Another  way  of  examining  the  nature  of  the  relationship  between  the  socioeconomic  and 
demographic  variables  and  the  proportion  of  Stage  1  cases  was  to  perform  a  discriminant 
function  analysis.  In  this  analysis  we  divided  the  tracts  into  the  three  aforementioned  categories 
-  the  highest  quartile,  middle  50%,  and  the  lowest  quartile  of  Stage  1  diagnoses.  The  analysis 
consists  of  a  multivariate  test  whereby  all  of  the  above  socioeconomic  and  racial/ethnic  variables 
are  taken  together,  adjusting  for  the  known  correlations  and  dependencies  between  pairs  of 
variables,  to  determine  whether  there  are  consistent  differences  among  the  three  groups  on  these 
variables.  Table  1  shows  the  means  for  each  of  the  census  variables  for  each  of  the  Stage  1 
categories.  The  tracts  in  the  lowest  quartile  were  the  referent  group,  with  comparisons  being 
made  between  this  and  the  middle  and  high  quartiles.  All  comparisons  were  statistically 
significant  (p<.0001)  and  in  the  expected  direction.  Tracts  with  the  lowest  proportion  of  Stage  1 
cases  were  higher  in  the  percent  of  those  with  less  than  nine  years  of  education,  lower  in  the 
percent  of  college  graduates,  higher  in  the  percent  of  blacks  and  Hispanics,  higher  in  the  percent 
unemployed,  and  lower  in  per  capita  income  when  compared  to  either  of  the  other  two 
categories. Clearly socioeconomic variables can discriminate between tracts with high and low,
and between tracts with medium and low, proportions of Stage 1 cases. We did not test the
difference  between  medium  and  high  because  only  two  statistical  comparisons  are  permissible, 
but  a  casual  comparison  would  suggest  little  difference  between  the  middle  and  high  group  on 
socioeconomic  and  racial/ethnic  variables.  Socioeconomic  and  racial/ethnic  variables  are 
important  correlates  of  screening  utilization. 


Table 1. Proportion of Stage 1 Tracts by Selected Sociodemographic Variables from the
1990 Census, Massachusetts

Proportion     < 9 yrs ed   College   Black     Hispanic   Unemployed   Per capita
Stage 1                     Grads                                       Income
Lowest 25%     12.05%       22.39%    11.08%    8.53%      8.83%        $14969.26
Middle 50%     8.50%        27.40%    4.79%     5.00%      7.06%        $17609.83
Highest 25%    9.35%        27.73%    5.63%     5.32%      7.10%        $17144.12

While an examination of census tracts according to the proportion of Stage 1 cases
diagnosed  provides  some  insights  into  whether  tracts  are  doing  well  or  poorly  with  respect  to 
detecting  cases  early,  an  examination  of  tracts  according  to  the  proportion  of  Stage  3  cases  may 
reveal  whether  tracts  are  high  or  low  with  respect  to  the  proportion  of  cases  diagnosed  at  a 
distant  stage.  Figure  4  shows  the  distribution  of  tracts  according  to  the  proportion  of  Stage  3 
cases  diagnosed  in  residents  of  each  tract.  The  distribution  is  quite  skewed,  with  most  tracts 
showing  a  low  proportion  of  Stage  3  cases.  Nevertheless  there  is  variability  and  a  long  tail,  with 
some  tracts  showing  a  relatively  high  proportion  of  Stage  3  cases.  As  noted  earlier,  12  tracts  had 
no  breast  cancer  cases,  leaving  1165  census  tracts  available  for  analysis. 

As with the proportion of Stage 1 cases, tracts were divided into three categories: 1) the
lowest quartile - tracts with no cases diagnosed at Stage 3; 2) the middle 50% - tracts where
there was at least one case diagnosed at Stage 3, but where the proportion was less than 0.125;
and 3) the highest quartile - tracts where the proportion of cases diagnosed at Stage 3 was 0.125
or more.


Figure 4. Distribution: Proportion of Stage 3 Cases for Each of 1165 Census Tracts.

[Histogram. Horizontal axis: proportion of cases that are Stage 3. Cases from 1982-1986 aggregated to level of census tract.]

Besides the aggregated view of late stage at diagnosis (Figure 4), we can view the data
geographically  (Figure  5).  While  Figure  4  shows  that  most  cases  are  not  diagnosed  at  Stage  3,  it 
provides  no  information  about  the  location  of  tracts  where  the  proportion  of  Stage  3  diagnoses  is 
relatively  high,  and  no  information  on  whether  there  are  clusters  of  tracts  with  high  proportions 
of  Stage  3  diagnoses. 

In  Figure  5,  the  green  tracts  are  in  the  lowest  quartile  with  respect  to  the  proportion  of 
Stage  3  cases,  while  the  red  tracts  are  in  the  highest  quartile,  indicating  a  high  proportion  of 
Stage  3  cases.  This  may  indicate  areas  of  poor  screening.  Are  there  regions  of  the  state  where 
there  are  excessive  numbers  of  Stage  3  cases  relative  to  the  total  number  of  Stage  3  cases?  We 
again applied Kulldorff's spatial scan statistic, comparing the number of Stage 3 cases in each
tract to the number of cases expected if only chance variation were operating (e.g., the
proportion  of  Stage  3  cases  expected  across  similar  groups  of  tracts).  In  this  instance,  there  were 
1433  cases  diagnosed  at  Stage  3  out  of  a  total  of  18,627  cases  for  the  five-year  period  (1982- 
1986).  The  spatial  scan  statistical  analysis  revealed  four  overlapping  clusters  with  excessive 
numbers  of  Stage  3  cases.  Of  the  four  clusters  identified,  one  is  displayed  in  Figure  6. 


Figure  5.  Display  of  Tracts  that  are  High  (Red)  and  Low  (Green)  in  Proportion  of  Stage  3 
Cases. 




Note:  The  mapping  software  used  to  produce  these  figures  makes  use  of  color  to  distinguish  data  categories.  In 
this  figure,  the  green  and  red  symbols  have  both  reproduced  as  black. 




For  the  tracts  within  the  cluster  circled  in  Figure  6,  there  were  a  total  of  23  breast  cancer 
cases  of  which  10  were  diagnosed  at  Stage  3.  To  determine  the  degree  of  excess,  we  first 
calculate  the  expected  number  of  cases  for  that  area  as  the  product  of  the  number  of  cases  for 
that  region  multiplied  by  the  expected  proportion  of  Stage  3  cases  (the  total  number  of  Stage  3 
cases  divided  by  the  total  number  of  cases  diagnosed  statewide): 

23*(1433/18627)=1.77 

The excess is then 10/1.77=5.65; that is, there were 5.65 times as many cases as expected, which
is 465% above what we would expect if only chance were operating.^ There are three additional
clusters with excesses of Stage 3 cases varying from 49% to 600%. Since these clusters overlap,
Dr. Kulldorff recommends that we report only one cluster, the most likely, and points out that “it
is a good illustration ... of the fact that we cannot determine the exact location and shape of any
detected cluster, but only the general location.”*
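
The arithmetic for the most likely cluster can be reproduced with a short sketch. The function names are illustrative; the inputs are the counts given in the text (23 cases in the cluster, 10 of them Stage 3, and 1433 Stage 3 cases among 18,627 cases statewide).

```python
def expected_stage3(zone_cases, state_stage3, state_cases):
    """Expected Stage 3 count if the zone had the statewide Stage 3 proportion."""
    return zone_cases * state_stage3 / state_cases


def relative_excess(observed, expected):
    """Return (observed/expected ratio, percent above expectation)."""
    ratio = observed / expected
    return ratio, (ratio - 1) * 100


expected = expected_stage3(23, 1433, 18627)
ratio, pct_above = relative_excess(10, expected)
print(f"expected={expected:.2f} ratio={ratio:.2f} excess={pct_above:.0f}%")
# prints: expected=1.77 ratio=5.65 excess=465%
```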


Figure  6.  Most  Likely  Cluster  of  Tracts  with  Excesses  of  Stage  3  Cases. 


Note: The mapping software used to produce these figures makes use of color to distinguish data categories. In
this  figure,  the  green  and  red  symbols  have  both  reproduced  as  black. 


^ Thanks to Dr. Martin Kulldorff from the National Cancer Institute for these calculations and the use of his
statistical software.

* Kulldorff, personal communication, May 1996.
Having identified the most likely cluster of tracts with excesses of Stage 3 cases, the
system can be queried for additional information such as the location of mammography sites, or
the economic, racial/ethnic, educational, or occupational characteristics of the people living in
these tracts. As an example, Figure 7 shows the educational characteristics of those living in the
Lowell, MA area, previously identified as a cluster of tracts with an excess of Stage 3 cases.
(Each pie chart represents one census tract.) The table inset within Figure 7 shows the
correlations between selected census variables and the proportion of Stage 3 cases within the
Lowell region, where the clusters were identified.
The correlations are positive for the proportion with less than nine years of education (.32), the
proportion who are black (.33), the proportion who are Hispanic (.46), and the proportion who
are unemployed (.33); the correlations are negative between the proportion of Stage 3 cases and
the proportion with four years of college (-.33) and per capita income (-.28). All correlations are
statistically significant at p<.05, except the correlation between proportion of Stage 3 cases and
the proportion Hispanic, which is significant at p<.01, and per capita income, which does not
reach significance at p<.05.
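
As an illustration of how such tract-level correlations might be computed, the following sketch calculates a Pearson correlation and the associated t statistic (with n-2 degrees of freedom) for testing whether the correlation differs from zero. The tract values are hypothetical and are not project data, and the report does not specify which correlation coefficient or software routine was used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def t_statistic(r, n):
    """t statistic for H0: correlation = 0 (n - 2 degrees of freedom, |r| < 1)."""
    return r * sqrt((n - 2) / (1 - r * r))


# Hypothetical values for six tracts (not taken from the registry or census files).
prop_under_9_years = [0.05, 0.12, 0.08, 0.20, 0.03, 0.15]
prop_stage3 = [0.04, 0.10, 0.09, 0.12, 0.00, 0.07]

r = pearson_r(prop_under_9_years, prop_stage3)
t = t_statistic(r, len(prop_stage3))
print(round(r, 2), round(t, 2))
```

The t statistic would then be compared against a t distribution to obtain the p-values quoted in the text.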

While  Figure  7  illustrates  how  each  of  these  census  measures,  such  as  educational  level, 
can  be  displayed  geographically,  we  can  also  perform  the  more  traditional  discriminant  function 
analysis  to  determine  how  well  the  socioeconomic  and  racial/ethnic  measures  separate  the  lowest 
quartile  and  the  middle  50%  of  tracts  from  the  tracts  in  the  highest  quartile  of  Stage  3  cases. 

We  grouped  tracts  into  three  categories  based  upon  the  distribution  of  the  proportion  of 
Stage  3  cases.  Since  there  were  441  tracts  with  zero  Stage  3  cases,  the  bottom  category 
contained  37.5%  of  the  tracts,  the  middle  category  contained  the  next  36.6%,  and  the  highest 
category  contained  25.6%  of  the  tracts.  Table  2  displays  the  means  for  the  census  variables,  and 
shows  that  the  tracts  with  the  highest  proportion  of  Stage  3  cases  have  the  highest  percentage  of 
people  with  a  ninth  grade  education  or  less,  the  lowest  percentage  of  college  graduates,  the 
highest  percentage  of  blacks  and  Hispanics,  the  highest  unemployment  rate,  and  the  lowest  per 
capita  income.  All  differences  between  the  lowest  and  highest  groups  are  statistically  significant 
(p<.02) except the proportion of blacks in these tracts. All of the differences between the middle
tracts and the highest tracts reach univariate statistical significance (p<.00005).
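
The three-way grouping described above might be sketched as follows. The exact cutoff used for the highest category is not given in the report, so this sketch assumes that tracts with zero Stage 3 cases form the lowest group, that roughly the top quarter of tracts by proportion form the highest group, and that the remainder form the middle group. Tract identifiers and proportions are hypothetical, and ties at the cutoff are not handled specially.

```python
def group_tracts(proportions, top_share=0.25):
    """Map each tract id to 'lowest', 'middle', or 'highest'.

    proportions: dict of tract id -> proportion of cases diagnosed at Stage 3.
    top_share: assumed share of tracts placed in the highest group.
    """
    ranked = sorted(proportions, key=proportions.get, reverse=True)
    n_top = round(len(ranked) * top_share)
    top = {t for t in ranked[:n_top] if proportions[t] > 0}
    groups = {}
    for tract, p in proportions.items():
        if p == 0:
            groups[tract] = "lowest"      # tracts with no Stage 3 cases
        elif tract in top:
            groups[tract] = "highest"
        else:
            groups[tract] = "middle"
    return groups


# Hypothetical tracts and proportions.
example = {"A": 0.0, "B": 0.0, "C": 0.0, "D": 0.05,
           "E": 0.08, "F": 0.10, "G": 0.25, "H": 0.40}
groups = group_tracts(example)
print(groups)
```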
Table 2. Proportion of Stage 3 Tracts by Selected Sociodemographic Variables from the
1990 Census, Massachusetts

Proportion      < 9 yrs ed   College Grads   Black    Hispanic   Unemployed   Per capita Income
of Stage 3

Lowest 25%         9.76%        26.75%       8.18%     6.47%       7.62%         $16765.96
Middle 50%         7.31%        28.89%       3.67%     3.74%       6.52%         $18233.26
Highest 25%       12.28%        21.57%       8.43%     8.43%       8.79%         $14892.95

Figure  8  shows  the  geographical  distribution  of  mammography  sites  in  the  Lowell  area. 
It  would  appear  that  the  mammography  sites  are  somewhat  remote  from  the  primary  cluster  of 
high  Stage  3  census  tracts.  The  geographical  information  system  data  base  also  includes  roads 
and  railroads,  and  provides  a  basis  for  determining  how  accessible  these  sites  are  to  residents  of 
the high Stage 3 census tracts. It may be that a combination of site location and access to public
transportation deters participation in screening programs for certain residents.
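
One simple way to quantify remoteness is the straight-line (great-circle) distance from a tract centroid to the nearest mammography site. The sketch below uses the haversine formula; the site names and coordinates are hypothetical, and straight-line distance is only a first approximation, since it ignores the road and rail network that the GIS data base could supply.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two latitude/longitude points."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_site(tract_centroid, sites):
    """Return (site name, distance in km) for the site closest to the centroid."""
    lat, lon = tract_centroid
    distances = ((name, haversine_km(lat, lon, s_lat, s_lon))
                 for name, (s_lat, s_lon) in sites.items())
    return min(distances, key=lambda pair: pair[1])


# Hypothetical coordinates, for illustration only.
sites = {"Site 1": (42.70, -71.20), "Site 2": (42.60, -71.35)}
name, km = nearest_site((42.64, -71.32), sites)
print(name, round(km, 1))
```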


Figure  7.  Educational  Characteristics  by  Census  Tracts  &  Table  Correlating  Census  Variables 
with  the  Proportion  of  Stage  3  Cases. 

Education  Level  - 1990  Census 
CONCLUSIONS:

This  study  demonstrates  how  data  from  diverse  sources  can  be  integrated  and  analyzed 
geographically  to  assess  screening  efficacy.  This  system  can  be  used  by  public  health  officials  to 
monitor  breast  cancer  screening  in  particular  areas,  and  could  be  easily  adapted  to  monitor  other 
kinds  of  cancers  and  cancer  screening  activities.  In  our  demonstration  we  identified  a  specific 
geographical  area  with  a  higher  proportion  of  late-stage  breast  cancer  diagnoses  than  the  rest  of 
the  state.  Assuming  that  the  same  pattern  is  found  with  data  from  1987  through  1992,  those 
responsible  for  conducting  screening  programs  within  that  area  might  be  alerted  as  to  the  need 
for  more  effective  screening.  Furthermore,  concomitant  information  about  the  region  from  the 
census  could  be  helpful  in  designing  effective  interventions. 

The  socioeconomic  and  racial/ethnic  associations  with  early  and  late  stages  of  diagnosis 
are  not  new  (Farley,  1989);  what  is  new  is  being  able  to  single  out  a  particular  geographical 
region  with  statistically  significant  excesses  and  immediately  access  the  related  socioeconomic 
and  racial/ethnic  characteristics  of  that  region  and  put  that  information  into  the  hands  of 
intervention  planners.  It  is  known  that  interventions  work  better  if  they  take  target  population 
characteristics  into  account. 
It should be noted that in this study only proxy measures were available for
mammography. Roffers and Austin (1993) suggest that the measure “percent in situ of all cases”
can  reflect  frequency  of  mammography  screening  and  the  measure  “percent  localized  of  all 
invasive  cases  of  known  stage”  may  reflect  frequency  of  manual  screening.  It  is  entirely 
possible  to  incorporate  actual  mammography  utilization  data  as  it  becomes  available.  Use  of 
such  data  would  allow  for  a  better  assessment  of  the  relationships  between  sociodemographic 
characteristics,  utilization  of  mammography,  and  stage  at  diagnosis. 

While  analyses  in  this  study  were  conducted  at  the  level  of  the  census  tract,  it  is  also 
possible  to  aggregate  at  lower  levels,  for  example,  at  the  census  block  group  level  as  Krieger 
demonstrated  in  her  San  Francisco  study  (1992).  Such  finer  analyses  may  be  needed  in  urban 
areas,  while  analysis  at  higher  levels  of  aggregation,  such  as  towns  (MCDs),  or  even  CHNAs 
might  be  appropriate  for  certain  kinds  of  studies. 

It  should  be  recognized  that  cases  are  assigned  to  census  tracts  on  the  basis  of  the 
patient’s address recorded at the time of diagnosis. Problems in the address fields may occur
when the patient provides a business, mailing, temporary, or care-of address rather than a usual
residence address. Such address problems introduce errors in assigning correct census tracts. In
addition,  the  geocoding  process  itself  introduces  tracting  errors  through  mistakes  in  the  reference 
GIS  data.  Examples  would  include  inexact  alignment  of  street-level  data  overlain  on  census 
tract  boundaries,  misnumbered  buildings,  misnamed  streets,  inverted  block  numbering,  and 
missing  building  numbers  and  street  names.  Such  errors  in  the  reference  GIS  data  may  also  lead 
to  misassigned  census  tracts. 

Another  caution  entails  the  assignment  of  socioeconomic  variables  based  on  tracts  to 
aggregated  cases  within  those  tracts.  Patients  within  a  tract  may  not  be  typical  of  other  residents 
within  those  tracts.  Krieger  (1992),  however,  has  found  that  the  use  of  socioeconomic  data,  at 
least  at  the  level  of  the  tract  and  block  group,  is  generally  not  misleading,  and  consistent  with  the 
findings  of  others,  and  probably  underestimates  the  effects  that  would  have  been  observed  were 
individual-level  data  available. 

Future  Work 

In  the  process  of  integrating  these  diverse  data  sources  and  methods  of  analysis,  we 
deliberately  focused  on  breast  cancer  data  from  the  years  1982  through  1986,  using  data  from 
that  period  to  test  the  system.  In  this  way,  substantive  findings  could  then  be  cross-validated 
with  data  from  1987  through  1992  and  beyond.  While  we  were  conducting  these  analyses,  we 
found  errors  in  geocoding  that  need  to  be  addressed  and  corrected.  We  have  almost  completed 
making  these  corrections  on  the  1982-1986  data.  In  addition,  we  have  found  problems  with  the 
1987  to  1992  data,  and  are  correcting  these  problems  as  well.  During  the  extension  period,  we 
will  first  need  to  reproduce  the  analyses  on  the  corrected  1982  through  1986  data  and  ensure  that 
our  substantive  findings  are  correct.  We  will  next  cross-validate  our  work  using  the  1987  to 
1992  breast  cancer  data  to  determine  whether  substantive  findings  from  the  earlier  period  are 
stable  across  time  periods  or  whether  findings  vary  from  location  to  location. 
The above work has been conducted at the level of the census tract. Other work within
the  Massachusetts  Department  of  Public  Health  aggregates  to  the  town  level.  We  will  want  to 
address  the  issue  of  level  and  compare  findings  from  tract  level  to  findings  at  town  level.  For 
some  geographical  areas,  especially  urban  centers,  we  hope  to  explore  the  use  of  block  group 
level  units  of  analysis. 

As  noted  previously,  the  MCR  only  began  to  collect  data  on  in  situ  cases  in  1992.  We 
plan  to  conduct  studies  using  in  situ  data  from  1992  to  examine  surveillance  efficacy.  Data  from 
1993  and  1994  will  also  be  available  during  the  extension  period,  and  it  may  be  necessary  to 
combine  in  situ  data  from  these  three  years  in  order  to  conduct  spatial  scans.  Throughout  the 
extension  period,  we  shall  continue  to  analyze  concomitantly  census  data,  mammography  site 
data,  and  other  relevant  data  as  they  become  available. 

During  Year  3,  we  plan  to  consult  with  Dr.  Nancy  Krieger  at  the  Harvard  School  of 
Public  Health,  an  expert  on  using  socioeconomic  data  from  the  census.  We  also  plan  to  maintain 
close  contact  with  professional  geographers  such  as  Gerard  Rushton  from  the  University  of  Iowa 
and  Ellen  Cromley  from  the  University  of  Connecticut,  and  leading  spatial  statisticians  such  as 
Martin Kulldorff from the National Cancer Institute and Joseph Glaz from the University of
Connecticut. 

The  GIS-based  surveillance  system  being  developed  should  provide  useful  information, 
based  upon  the  integration  of  diverse  data  sets  into  a  system  capable  of  up-to-date  spatial  display 
and  statistical  analyses  using  both  spatial  and  traditional  statistical  techniques.  Sharing  the 
system  with  potential  users  as  we  progress  should  also  provide  suggestions  for  practical 
improvement.  This  system  should  be  able  to  evaluate  the  effectiveness  of  breast  cancer 
screening  programs,  and  help  to  target  those  areas  that  would  benefit  from  additional  screening. 
APPENDIX A. TECHNICAL AND FUNCTIONAL SPECIFICATIONS FOR
SOFTWARE PROTOTYPE

TECHNICAL SPECIFICATIONS


Software  Developer  Technical  Requirements: 

• Microsoft Windows 95 and Windows NT Workstation (v.3.51 and up) are the target
operating systems

• Microsoft Windows Network based on Windows NT Server v.3.51

•  Microsoft  Visual  Basic  v.4.0  Enterprise  Edition  is  the  software  development  environment 

•  Microsoft  SQL  Server  Relational  Database  is  the  database  system  utilized 

•  Microsoft  SQL  Server  Workstation  Edition  is  the  database  development  environment 

•  SPSS  Developers  Kit  for  Windows  provides  the  statistical  libraries  that  will  be  embedded 
into  the  system 

• SylvanMaps/OCX for Windows provides the mapping libraries necessary for embedding into
the system


Technical Specifications/Requirements/Features:

•  32-bit  based  software  prototype 

•  Remote  OLE  Automation  Server 

•  Export  capability  (exporting  of  query  results) 

•  Creation  of  re-useable  classes 

• RDO will be the chief class-based database API used to connect to MS SQL Server

• Windows 95 or Windows NT (v.3.51 and up)

•  User  interface  will  be  Windows  95/NT  v.4.0  compliant 

•  16MB  of  RAM 

•  Color  VGA  monitor  (640  x  480  minimum  video  resolution) 

•  Microsoft  compatible  mouse 

• Maximum storage is estimated at 10MB of available hard disk space (for software program
files)

• MS SQL based database will be resident on Windows NT Server v.3.51 (MCR’s server)

• Modem (for remote users only) capable of 28.8 Kbps


Data  to  be  incorporated  into  the  database: 

•  Population  denominators 

•  Socioeconomic  variables 

•  Cancer  incidence  (aggregate) 

•  Cancer  mortality  (aggregate) 

•  Behavioral  Risk  Factor  Surveillance  System  data 

•  Mammography  sites 
FUNCTIONAL SPECIFICATIONS


Since  reliable  methods  for  measuring  breast  cancer  screening  are  not  yet  available,  surrogate  or 
proxy  measures  of  screening  might  be  useful  until  better  measures  are  developed.  Two  such 
measures  are:  (1)  the  proportion  of  breast  cancer  cases  in  each  census  tract  that  are  diagnosed  as 
Stage  1,  and  (2)  the  proportion  of  breast  cancer  cases  diagnosed  as  in  situ.  The  software  will 
incorporate  these  measures  along  with  other  relevant  concomitant  data  such  as  the  number  and 
location  of  mammography  sites,  and  the  racial/ethnic,  educational,  economic  and  age 
compositions  of  each  geographic  area. 
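
A minimal sketch of how these proxy measures might be tabulated from case-level records is given below. The record layout (a tract identifier paired with a stage label) and the tract numbers are assumptions made for illustration and do not describe the registry's actual file format.

```python
from collections import Counter, defaultdict


def stage_proportions(cases):
    """Tabulate the proportion of cases at each stage within each tract.

    cases: iterable of (tract_id, stage) pairs.
    Returns {tract_id: {stage: proportion of that tract's cases}}.
    """
    counts = defaultdict(Counter)
    for tract, stage in cases:
        counts[tract][stage] += 1
    return {
        tract: {stage: k / sum(c.values()) for stage, k in c.items()}
        for tract, c in counts.items()
    }


# Hypothetical records (tract ids and stages are illustrative only).
records = [
    ("3101", "Stage 1"), ("3101", "Stage 1"), ("3101", "Stage 3"), ("3101", "in situ"),
    ("3102", "Stage 3"), ("3102", "Stage 1"),
]
props = stage_proportions(records)
print(props["3101"].get("Stage 1", 0.0), props["3101"].get("in situ", 0.0))
```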

Data files:

File 1a: Census tract data (1177 tracts) - 1990 Census data, which will include race/ethnicity,
         education and economic variables

File 1b: City/town data (351 cities/towns) - 1990 Census data, which will include
         race/ethnicity, education and economic variables

File 2:  Massachusetts Cancer Registry data - breast cancer incidence for 1982-1992 by
         census tract, age and stage

File 3:  Population data - four-digit census tracts and their respective 1980 and 1990
         populations for 18 five-year age groups, by sex

File 4:  Mammography site data - address, census tract and number of machines

Sample  output: 

• By census tracts, towns, CHNAs - proportion of cases diagnosed in situ, Stage 1, Stage 2
and Stage 3

• Cutoffs for low, medium and high proportions

• Associations with SES variables

Graph: Distribution of proportion of in situ, Stage 1 and Stage 3 cases for each census
       tract

Map: Map low, medium, high proportion of Stage 1 and Stage 3. Highlight low/high
     proportions of in situ and Stage 1

Statistics:  Age-specific  rates,  age-adjusted  rates,  standardized  incidence  ratios 
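
The three statistics listed above can be sketched as follows, assuming direct standardization for the age-adjusted rate and indirect standardization for the standardized incidence ratio (SIR). The specification does not name the standard population or reference rates to be used, so the three age groups and all numbers here are hypothetical; the actual files carry 18 five-year age groups.

```python
def age_specific_rates(cases, population, per=100_000):
    """Rate per `per` persons in each age group."""
    return [c / p * per for c, p in zip(cases, population)]


def direct_age_adjusted_rate(cases, population, standard_pop, per=100_000):
    """Weighted average of age-specific rates, weighted by a standard population."""
    rates = age_specific_rates(cases, population, per)
    return sum(r * w for r, w in zip(rates, standard_pop)) / sum(standard_pop)


def standardized_incidence_ratio(cases, population, reference_rates, per=100_000):
    """Observed cases divided by cases expected under the reference rates."""
    expected = sum(p * r / per for p, r in zip(population, reference_rates))
    return sum(cases) / expected


# Hypothetical area with three age groups (illustrative numbers only).
cases = [2, 10, 20]
population = [10_000, 8_000, 5_000]
standard_pop = [40_000, 35_000, 25_000]
reference_rates = [10.0, 100.0, 300.0]   # per 100,000

adjusted = direct_age_adjusted_rate(cases, population, standard_pop)
sir = standardized_incidence_ratio(cases, population, reference_rates)
print(round(adjusted, 2), round(sir, 2))
```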
APPENDIX B

SOFTWARE PROTOTYPE SCREENS

MCR/DOD BCCES Software Development
For Software Prototype

DoD BCCES Software Prototype Screens:

Please make a selection.

Query Window Help

Select Diagnosis Year